Covariance Assisted Screening and Estimation.
Abstract
Consider a linear model Y = Xβ + z, where X = X_{n,p} and z ~ N(0, I_n). The vector β is unknown, and the goal is to separate its nonzero coordinates from the zero ones (i.e., variable selection). Motivated by examples in long-memory time series (Fan and Yao, 2003) and the change-point problem (Bhattacharya, 1994), we are primarily interested in the case where the Gram matrix G = X'X is non-sparse but sparsifiable by a finite-order linear filter. We focus on the regime where signals are both rare and weak, so that successful variable selection is very challenging but still possible. We approach this problem with a new procedure called Covariance Assisted Screening and Estimation (CASE). CASE first uses linear filtering to reduce the original setting to a new regression model whose Gram (covariance) matrix is sparse. The new covariance matrix induces a sparse graph, which guides us to conduct multivariate screening without visiting all the submodels. By interacting with the signal sparsity, the graph enables us to decompose the original problem into many separated small-size subproblems (if only we knew where they are!). Linear filtering also induces a so-called problem of information leakage, which can be overcome by the newly introduced patching technique. Together, these give rise to CASE, a two-stage Screen and Clean procedure (Fan and Song, 2010; Wasserman and Roeder, 2009): we first identify candidates for these submodels by patching and screening, and then re-examine each candidate to remove false positives. For any variable selection procedure β̂, we measure performance by the minimax Hamming distance between the sign vectors of β̂ and β. We show that in a broad class of situations where the Gram matrix is non-sparse but sparsifiable, CASE achieves the optimal rate of convergence. The results are successfully applied to long-memory time series and the change-point model.
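To make the idea of a Gram matrix that is "non-sparse but sparsifiable by a finite-order linear filter" concrete, here is a minimal Python sketch. It is an illustration only: the specific design matrix (X[i, j] = 1 when j >= i, a common way to write a change-point model as a regression) and the second-order differencing filter with coefficients (1, -2, 1) are assumptions made for this example and are not stated in the abstract. The function name `gram_and_filtered` is hypothetical.

```python
import numpy as np

def gram_and_filtered(n):
    """Return (G, B) for an assumed change-point design of size n.

    G = X'X is the Gram matrix; B = D G is the filtered version.
    Both the design X and the filter D are illustrative assumptions.
    """
    # Assumed design: X[i, j] = 1 when j >= i (upper-triangular ones),
    # so that G[j, k] = min(j, k) + 1 with 0-based indices.
    X = np.triu(np.ones((n, n), dtype=int))
    G = X.T @ X

    # Assumed filter: second-order adjacent differencing (1, -2, 1),
    # truncated at the lower boundary of the matrix.
    D = (np.eye(n, dtype=int)
         - 2 * np.eye(n, k=1, dtype=int)
         + np.eye(n, k=2, dtype=int))
    B = D @ G
    return G, B

G, B = gram_and_filtered(8)
print("nonzeros per row of G:", np.count_nonzero(G, axis=1))
print("nonzeros per row of B:", np.count_nonzero(B, axis=1))
```

In this sketch every entry of G is nonzero, while each row of B other than the last two has a single nonzero entry, located just to the right of the diagonal. The last two rows stay dense only because the filter is truncated at the boundary; how boundaries are treated in the actual procedure is not covered by the abstract.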
Journal title:
- Annals of Statistics
Volume 42, Issue 6
Pages -
Publication date: 2014